Abstract

We show that the Kolmogorov-Chaitin complexity K(s) of a
string s, numerically approximated using Levin's coding theorem with
the associated measure $K_m(s)$ (a method we have therefore named
the Coding Theorem Method) as calculated from the frequency of
production of a large set of small deterministic Turing machines
with up to 5 states (and 2 symbols), correlates with the number of
instructions used by the Turing machines, in agreement with strict
integer-value program-size complexity. Nevertheless, $K_m$ proves to be
a finer-grained measure and a better approach for distinguishing K for
short strings from non-integer evaluations. We also show that neither
$K_m$ nor the number of instructions used suggests any correlation with
Bennett's concept of Logical Depth (the shortest runtime of the shortest
computer program producing s). Finally, we announce a first version
of an online complexity calculator based on a combination of theoretical
concepts as an implementation of our Coding Theorem Method.

Keywords: Coding Theorem Method; Kolmogorov-Chaitin complexity;
Solomonoff-Levin algorithmic probability; Levin's universal
distribution; Levin-Chaitin coding theorem; Program-size complexity;
small Turing machines.

1 Introduction



Kolmogorov complexity (also known as Kolmogorov-Chaitin, algorithmic or
program-size complexity) is recognized as a fundamental concept, but it is
also often thought of as having little or no applicability because it is not
possible to provide stable numerical approximations for finite (particularly
short) strings by using the traditional approach, namely lossless compression
algorithms. We advance a method that can overcome this limitation, and
which, even if limited in ways both theoretical and practical, nonetheless
offers a means of providing sensible values for the complexity of short strings,
complementing the traditional lossless compression method that works well
for long strings. This is done at the cost of massive calculations in order
to use the coding theorem from algorithmic probability that relates the
frequency of production of a string with its Kolmogorov complexity.

Bennett's logical depth, on the other hand, is a measure of the complexity
of strings that, unlike Kolmogorov complexity, measures the organized
information content of a string and not its random incompressible
complexity. To calculate the non-computable is always a challenge, and it can of
course only be achieved partially, but Bennett's logical depth is even more
difficult to calculate given its own particularities. This approach, however,
represents a first numerical attempt to provide exact calculations for a fixed
formalism.

The independence of the two measures, that is, Kolmogorov complexity
and Logical depth (which has been established theoretically), is numerically
tested and reported in this paper. Our work is in agreement with what the
theory predicts, even for short strings and despite the limitations of this
approach. The ability to apply these concepts to practical problems
(established in a series of articles; see [13, 22]) is novel, and they should prove to
have many applications where evaluations of the complexity of finite short
strings are needed. Their practical numerical approximations suggest that
our Coding Theorem Method is sound and of potential use where compression
algorithms (the method traditionally used) fail to provide any sensible
approximation to K.

In Sections 2, 3, 4 and 5 we introduce the measures, tools and formalism
used in Section 6 to advance our Coding Theorem Method to approximate
the Kolmogorov complexity of short strings. Finally in Section 7, we report
the results of the analysis of the comparison among the various measures
calculated by the method, particularly the complexity by number of
instructions and the logical depth.

2 Kolmogorov-Chaitin complexity



When researchers have chosen to apply the theory of algorithmic
information (AIT), which in principle is not supposed to be of any practical use
[7], it has proven to be of great value, for example, for DNA false positive
repeat sequence detection in genetic sequence analysis [19], in distance
measures and classification methods [8], and in numerous other applications [18].
This effort has, however, been hamstrung by the limitations of compression
algorithms (currently the only method used to approximate the Kolmogorov
complexity of a string), given that this measure is not computable.

Central to AIT is the definition of algorithmic (Kolmogorov-Chaitin or 
program-size) complexity [16, 6]: 

$$K_T(s) = \min\{|p| : T(p) = s\} \tag{1}$$

That is, the length of the shortest program p that outputs the string s 
running on a universal Turing machine T. A technical inconvenience of K 
as a function taking s to the length of the shortest program that produces 
s is its uncomputability. In other words, there is no program which takes a 
string s as input and produces the integer K(s) as output. This is usually 
considered a major problem, but one ought to expect a universal measure 
of complexity to have such a downside. 

The measure was first conceived to define randomness and is today the 
accepted objective mathematical measure of finite randomness, among other 
reasons because it has been proven to be mathematically robust (by virtue 
of the fact that several independent definitions converge in it). 

A classic example is a string composed of an alternation of bits, such as
$(01)^n$, that can be described as "n repetitions of 01". The string can grow
fast while the description will only grow by about $\log_2(n)$. On the contrary,
a random-looking string such as 011001011010110101 may not have a much
shorter description than itself.

Traditionally, the way to approach the algorithmic complexity of a string 
has been by using lossless compression algorithms. The result of a loss- 
less compression algorithm is an upper bound of its algorithmic complexity. 
Short strings, however, are difficult to compress in practice, and the theory 
does not provide a satisfactory solution to the problem of the instability of 
the measure for short strings. 
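
The difficulty with short strings can be observed directly. The following minimal sketch assumes Python's standard zlib module as a stand-in for a generic lossless compressor (the text does not name a particular one); the function name is illustrative. It prints the length of each input next to the length of its compressed form.

```python
import zlib

def compressed_length(s: str) -> int:
    """Bytes used by zlib for s: an upper bound on K(s), up to an additive constant."""
    return len(zlib.compress(s.encode("ascii"), 9))

# A short periodic string, the random-looking string from the text,
# and a long periodic string.
for s in ("01" * 4, "011001011010110101", "01" * 5000):
    print(len(s), compressed_length(s))
```

For the 8-character string the fixed overhead of the zlib format alone makes the output longer than the input, whereas the 10 000-character repetition shrinks to a small fraction of its length.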

The invariance theorem guarantees that complexity values will only
diverge by a constant c (e.g. the length of a compiler, a translation program
between $U_1$ and $U_2$) and that they will converge at the limit.

Invariance Theorem ([4, 18]): If $U_1$ and $U_2$ are two universal Turing
machines and $K_{U_1}(s)$ and $K_{U_2}(s)$ the algorithmic complexity of s for $U_1$
and $U_2$, there exists a constant c such that:

$$|K_{U_1}(s) - K_{U_2}(s)| < c \tag{2}$$


Hence the longer the string, the less important c is (i.e. the choice of 
programming language or universal Turing machine). However, in practice 
c can be arbitrarily large, thus having a very great impact on short strings. 

3 Bennett's Logical Depth 

The concept of Kolmogorov-Chaitin complexity formalizes the concepts of 
simplicity and randomness by means of information. As mentioned before, 
several applications based on K using compression algorithms have been 
successfully developed to date. None, however, seems to have exploited the 
concept of logical depth, with the exception of a previous paper of ours [22], 
which did so with encouraging results. A measure of the complexity of a 
string can be arrived at by combining the notions of algorithmic information 
content and time complexity. According to the concept of logical depth [1,2], 
the complexity of a string is best defined by the time that an unfolding 
process takes to reproduce the string from its shortest description. The 
longer it takes, the more complex. Hence complex objects are those which 
can be seen as "containing internal evidence of a nontrivial causal history." 
The concept of logical depth takes into account the plausible history of an 
object as an unfolding phenomenon. It combines the concept of the shortest 
possible description of an object with the time that it takes to evolve to the 
state it is in at any given moment. 

A typical example that illustrates the concept of logical depth and its
characterization as a measure of physical complexity is a sequence of fair
coin tosses. Such a sequence would have a high information content
(algorithmic complexity) because the outcomes are random, but little value
(logical depth) because they are easily generated and carry no message, no
meaning. The string 1111...1111 is also logically shallow: its minimal
program, besides being very small, requires little time to evaluate. In contrast, the
binary representation of the number $\pi$ is not shallow, because although it
is highly compressible (by any known formula producing $\pi$), the generating
algorithms require computational time to produce several digits of its
expansion. A better example is Chaitin's $\Omega$ number [6], the digits of which
encode the halting probability of a universal Turing machine, and which is
known to be very deep since no computable process can expand $\Omega$ except
for a finite number of digits [4, 14].

Bennett offers a careful development [1] of the notion of logical depth,
taking into account near-shortest programs as well as the shortest one, hence
the significance value for a reasonably robust and machine-independent
measure. Algorithmic complexity and logical depth are intimately related
theoretically, because logical depth requires an approximation of algorithmic
complexity. But they are also supposed to be different measures. Unlike
algorithmic complexity, which assigns a high complexity to both random
and highly organized objects, placing them at the same level, logical depth
assigns a low complexity to both random and trivial objects. It is thus more
in keeping with our intuitive sense of the complexity of physical objects,
because trivial and random objects are intuitively easy to produce, do not have
a long history, and unfold quickly. A clear, detailed explanation pointing
out the convenience of the concept of logical depth as a measure of organized
complexity as compared to plain algorithmic complexity, which is what is
usually used, is provided in [10]. We show that these measures do
capture different things and that they accord with our intuitive
sense of what each is supposed to measure (randomness versus simplicity in
the case of Kolmogorov complexity, and structure versus randomness and
simplicity in the case of logical depth).

For finite strings, one of Bennett's formal approaches to the logical 
depth of a string is defined as follows: 

Let s be a string and d a significance parameter. A string's depth at 
significance d is given by 

$$LD_d(s) = \min\{T(p) : (|p| - |p'| < d) \text{ and } (U(p) = s)\} \tag{3}$$

with $|p'|$ the length of the shortest program for s (therefore K(s)). In
other words, $LD_d(s)$ is the least time T required to compute s from a
d-incompressible program p on a Turing machine U.
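
Definition (3) can be read as a simple selection rule. The sketch below is only an illustration under a strong assumption: it takes a finite list of hypothetical (program length, runtime) pairs for programs known to output s, standing in for the set of all programs, which cannot be enumerated in general. The function name and the sample values are invented for the example.

```python
def logical_depth(programs, d):
    """Least runtime among programs whose length is within d of the shortest one.

    programs: iterable of (length, runtime) pairs, one per program that outputs s.
    d: significance parameter, as in equation (3).
    """
    programs = list(programs)
    if not programs:
        raise ValueError("at least one program producing s is required")
    shortest = min(length for length, _ in programs)
    return min(runtime for length, runtime in programs if length - shortest < d)

# Hypothetical candidates: the shortest program is slow, longer ones are faster.
candidates = [(10, 500), (12, 40), (15, 3)]
print(logical_depth(candidates, 1))   # only the shortest program qualifies
print(logical_depth(candidates, 3))   # programs of length 10 and 12 qualify
```

Increasing d admits longer programs, so the reported depth can only stay the same or decrease.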

Each of the three linked definitions of logical depth provided in [1] comes 
closer to a definition in which near-shortest programs are taken into consid- 
eration. In this experimental approach we make no such distinction among 
significance parameters, so we will denote the logical depth of a string s 
simply by LD(s). 

Like K(s), LD(s) as a function of s is uncomputable. A novel feature of
this research is that we were able to provide approximations of logical
depth that are exact within a fixed formalism. This was achieved by running
Turing machines sorted by increasing
size and finding the smallest and fastest machine from among a relatively
large sample of Turing machines that produced a given string.

4 Solomonoff-Levin Algorithmic Probability 

The algorithmic probability (also known as Levin's semi-measure) of a string
s is a measure that describes the expected probability of a random program
p running on a universal (prefix-free, see footnote 1) Turing machine T
producing s. Formally [21, 17, 6],

$$m(s) = \sum_{p : T(p) = s} 1/2^{|p|} \tag{4}$$

i.e. the sum over all the programs for which T with p outputs s and halts.

Levin's semi-measure m(s) (see footnote 2) defines a distribution known as the
Universal Distribution [15]. It is important to notice that the value of
m(s) is dominated by the length of the smallest program p (the one for
which the denominator $2^{|p|}$ is smallest). The length of the smallest p that produces the
string s is, however, K(s). The semi-measure m(s) is therefore also
uncomputable, because for every s, m(s) requires the calculation of
$2^{-K(s)}$, involving K, which is itself uncomputable. An alternative [13] to
the traditional use of compression algorithms is the use of the concept of
algorithmic probability to calculate K(s) by means of the following theorem.

Coding Theorem (Levin [17]): 

$$|-\log_2 m(s) - K(s)| < c \tag{5}$$

An informal interpretation is that if a string has many long descriptions 
it also has a short one. It beautifully connects frequency to complexity, 
more specifically the frequency (or probability) of occurrence of a string 
with its algorithmic (Kolmogorov) complexity. The coding theorem implies 
that [9, 4] one can calculate the Kolmogorov complexity of a string from its 
frequency [12, 11, 23, 13], simply rewriting the formula as: 

$$K_m(s) = -\log_2 m(s) + O(1) \tag{6}$$
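
As a minimal numerical sketch of equation (6), assume that m(s) is replaced by the relative frequency with which each string appears among the outputs of the machines that were run, and that the additive O(1) term is ignored. The function name and the counts below are hypothetical.

```python
import math

def km_from_counts(counts):
    """Map each string to -log2 of its relative output frequency."""
    total = sum(counts.values())
    return {s: -math.log2(c / total) for s, c in counts.items()}

# Hypothetical output counts: "0" was produced twice as often as "1" or "01".
km = km_from_counts({"0": 2, "1": 1, "01": 1})
for s, value in sorted(km.items(), key=lambda item: item[1]):
    print(s, value)
```

Strings that are produced more often receive smaller values, and the values are real numbers rather than integers.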

1 The group of valid programs forms a prefix-free set (no element is a prefix of any
other, a property necessary to keep 0 < m(s) < 1). For details see [4].

2 It is called a semi-measure because the sum is never 1, unlike probability measures.
This is due to the Turing machines that never halt.

An important property of m as a semi-measure is that it dominates any
other effective semi-measure $\mu$, because there is a constant $C_\mu$ such that
for all s, $m(s) \geq C_\mu \mu(s)$. For this reason m(s) is often called a universal
distribution [15].

5 Deterministic Turing machines 

The ability of a universal Turing machine to simulate any algorithmic
process (see footnote 3) has motivated and justified the use of universal Turing machines as the
language framework within which definitions and properties of mathematical
objects are given and studied.

However, it is important to describe the formalism of a Turing machine,
because exact values of algorithmic probability for short strings will be
approximated under this model, both for K(s) through m(s) (denoted by
$K_m$), and for K(s) in terms of the number of instructions used by the
smallest Turing machine producing s.

Consider a Turing machine with the binary alphabet $\Sigma = \{0, 1\}$ and n
states {1, 2, ..., n} and an additional Halt state denoted by 0 (as defined by
Rado in his original Busy Beaver paper [20]).

The machine runs on a 2-way unbounded tape. At each step: 

1. the machine's current "state" (instruction); and 

2. the tape symbol the machine's head is scanning 

define each of the following: 

1. a unique symbol to write (the machine can overwrite a 1 on a 0, a 0
on a 1, a 1 on a 1, and a 0 on a 0);

2. a direction to move in: -1 (left), 1 (right) or 0 (none, when halting);
and

3. a state to transition into (which may be the same as the one it was 
in). 

The machine halts if and when it reaches the special halt state 0. There
are $(4n + 2)^{2n}$ Turing machines with n states and 2 symbols according to
the formalism described above. The output string is taken from the number
of contiguous cells on the tape the head of the halting n-state machine has
gone through. A Turing machine is considered to produce an output string
only if it halts. The output is what the machine has written on the tape.

3 Under Church's hypothesis.

6 The Coding Theorem Method 

One can attempt to approximate m(s) by running every Turing machine 
following a particular enumeration. A natural one is a quasi-lexicographical 
ordering, from shorter to longer by number of states and symbols. Let (n, m) 
be the set of Turing machines with n states and m symbols. It is clear that 
in this fashion once a machine produces s for the first time, one can directly 
calculate an exact value of K. Because this is the length of the first Turing 
machine in the enumeration of programs of increasing size that produces s, 
there is no shorter machine producing s, and from Turing universality we 
know there is a machine $T \in (n, m)$ that produces s.

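Let's now formalize a function D(n, m) as was previously done in [13],
as an approximation of m(s) for binary strings as follows:

$$D(n, m)(s) = \frac{|\{T \in (n, m) : T(p) = s\}|}{|\{T \in (n, m) : T(p) \text{ halts}\}|} \tag{7}$$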


Where T(p) is the Turing machine with number p (and empty input) that
produces s upon halting, and |A| is, in this case, the cardinality of the set A.
We have previously proved [23, 13] that the function $(n, m) \mapsto D(n, m)$ is
non-computable by reduction to the halting problem. However, D(n, m) is
lower semi-computable, meaning it can be computably approximated from
below, for example, by running small Turing machines for which values of
the Busy Beaver problem [20] are known. For example, for n = 4, the Busy
Beaver function for maximum runtime, S, tells us that S(4, 2) = 107 [3], so
we know that a machine running on a blank tape will never halt if it hasn't
halted after 107 steps, and we can therefore stop it manually.
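
For very small n the whole procedure fits in a few lines. The following Python sketch is an illustration under stated assumptions, not the simulator used for the results reported here: it follows the formalism of Section 5 (initial state 1, halt state 0, blank tape of 0s, output read from the contiguous cells visited), enumerates every n-state, 2-symbol table, and stops each machine after a step bound. All function names are illustrative. The bound 6 used for n = 2 is the known Busy Beaver runtime S(2, 2) = 6, a value that is not quoted in the text above.

```python
from collections import Counter
from itertools import product

def rules(n):
    """All rules for one table entry: 4n non-halting ones plus 2 halting ones."""
    moving = [(w, d, q) for w in (0, 1) for d in (-1, 1) for q in range(1, n + 1)]
    halting = [(w, 0, 0) for w in (0, 1)]
    return moving + halting

def run(table, max_steps):
    """Run from state 1 on a tape of 0s; return the visited cells on halting, else None."""
    tape, pos, state, lo, hi = {}, 0, 1, 0, 0
    for _ in range(max_steps):
        write, move, nxt = table[(state, tape.get(pos, 0))]
        tape[pos] = write
        if nxt == 0:                      # halt state reached: read off the output
            return "".join(str(tape[i]) for i in range(lo, hi + 1))
        pos, state = pos + move, nxt
        lo, hi = min(lo, pos), max(hi, pos)
    return None                           # did not halt within the bound

def output_distribution(n, max_steps):
    """Frequency of each output among the machines that halt within max_steps."""
    keys = [(q, a) for q in range(1, n + 1) for a in (0, 1)]
    counts = Counter()
    for choice in product(rules(n), repeat=len(keys)):
        out = run(dict(zip(keys, choice)), max_steps)
        if out is not None:
            counts[out] += 1
    halting = sum(counts.values())
    return {s: c / halting for s, c in counts.items()}

d2 = output_distribution(2, 6)            # all 10 000 machines with 2 states
for s, p in sorted(d2.items(), key=lambda item: -item[1])[:6]:
    print(s, round(p, 5))
```

Because the step bound equals the Busy Beaver runtime, no halting machine is cut off for n = 2; for n = 5, where no such bound was available, the cut-off has to be justified separately, as in Section 6.3.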

Previously [23, 13] we had calculated the full output distribution of
Turing machines with 2 symbols and n = 1, ..., 4 states for which the Busy
Beaver values are known, in order to determine the halting time. That is a
total of 36, 10 000, 7 529 536 and 11 019 960 576 Turing machines respectively.

Because there are a large enough number of machines to run even for a
small number of machine states (n), applying the coding theorem provides
a fine and increasingly stable (due to the invariance theorem) evaluation
of K(s) based on the frequency of production of a large number of
Turing machines. But the number of Turing machines grows exponentially,
and producing D(5, 2) requires considerable computational resources.
Calculating D(5, 2) is an improvement on our previous numerical evaluations
and provides a larger data set to work with and to draw more significant
statistical conclusions from for the purposes of this research. There are
26 559 922 791 424 Turing machines with 5 states and 2 symbols, and the
values of Busy Beaver functions for these machines are unknown. In what
follows we describe how we proceeded. From now on, D(n) with a single
parameter will mean D(n, 2).
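
The machine counts quoted above follow from the shape of the transition table in Section 5: each of the 2n entries (one per state and read symbol) can hold one of 4n non-halting rules (2 symbols to write, 2 directions, n target states) or one of 2 halting rules (write 0 or 1 and stop). A short Python check of the formula, with an illustrative function name:

```python
def machine_count(n: int) -> int:
    """Number of n-state, 2-symbol machines in the formalism of Section 5."""
    entries = 2 * n        # one rule per (state, read symbol) pair
    choices = 4 * n + 2    # 2 symbols * 2 moves * n states, plus 2 halting rules
    return choices ** entries

for n in range(1, 6):
    print(n, machine_count(n))
```

For n = 1, ..., 5 this gives 36, 10 000, 7 529 536, 11 019 960 576 and 26 559 922 791 424.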

6.1 Reduction techniques 

We did not run all the Turing machines with 5 states to produce D(5)
because one can take advantage of symmetries and anticipate some of the
behavior of the Turing machines directly from their transition tables
without actually running them (this is impossible in general due to the halting
problem). We avoided some trivial machines whose results we know
without having to run them. For example, machines with the initial transition
moving to the halting state produce strings "0" and "1". It's easy to
quantify these machines and avoid running them, saving the time that would be
used generating trivial machines. Also, machines with the initial transition
staying in the initial state will not halt (as they run on a blank tape); they 
always remain in the initial state. So we are interested in machines with 
the initial transition moving to a state different from both the initial and 
the halting states. Moreover, we can exploit the left-right symmetry and 
run only machines starting on the right, and for every string s produced in 
the output data, include the reverse of s with the same (or an increased) 
frequency. 

To restrict the generating machines by imposing these constraints on
them we created a reduced enumeration that for n states contains
$2(n - 1)(4n + 2)^{2n-1}$ machines, with the initial transition moving to the right to
a state different from both the initial and halting states.
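
The size of the reduced enumeration can be checked in the same way as the full count. In the sketch below (function names are illustrative) the initial rule is restricted to 2 symbols to write, a move to the right, and one of the n - 1 states that are neither the initial nor the halting state, while the remaining 2n - 1 entries are unrestricted.

```python
from fractions import Fraction

def full_count(n: int) -> int:
    """All n-state, 2-symbol machines: (4n + 2)^(2n)."""
    return (4 * n + 2) ** (2 * n)

def reduced_count(n: int) -> int:
    """Machines in the reduced enumeration: 2(n - 1)(4n + 2)^(2n - 1)."""
    return 2 * (n - 1) * (4 * n + 2) ** (2 * n - 1)

print(reduced_count(5))                            # 9658153742336
print(Fraction(reduced_count(5), full_count(5)))   # 4/11
```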

For n = 5 this means running only 4/11 of the total number of Turing
machines. Moreover, we need the output of those machines starting with a
0-filled tape and with a 1-filled tape. But we do not run any machine twice,
as for every machine M producing the binary string s starting with a 1-filled
tape, there is a 0-1 symmetric machine M' that when starting with a 0-filled
tape produces the complement to one of s, that is, the result of replacing
all 0s in s with 1s and all 1s with 0s.
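
The two symmetries can be applied to the raw counts after the run. The sketch below is one plausible way of doing so and is an assumption about the bookkeeping, which the text does not spell out: every count from the reduced enumeration is credited to the string and to its reversal (the mirrored machine starting to the left), and then to the complement of each (the 0-1 symmetric machine on the other tape). The trivial machines that were never generated would have to be added separately.

```python
from collections import Counter

def complete_by_symmetry(counts):
    """counts: string -> number of machines in the reduced enumeration producing it."""
    # Left-right symmetry: the mirrored machine produces the reversed string.
    mirrored = Counter()
    for s, c in counts.items():
        mirrored[s] += c
        mirrored[s[::-1]] += c
    # 0-1 symmetry: the complemented machine on the other tape produces the complement.
    swap = str.maketrans("01", "10")
    completed = Counter()
    for s, c in mirrored.items():
        completed[s] += c
        completed[s.translate(swap)] += c
    return completed

print(complete_by_symmetry({"01": 3, "0": 1}))
```

A palindrome is credited twice by the first step, which is the case in which the frequency is increased rather than copied.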

After running the 9 658 153 742 336 machines in the reduced enumeration for
D(5), we completed the strings generated using the symmetries described.
We also counted the number of non-halting machines that were skipped.

6.2 Detecting non-halting machines 

It is useful to avoid running machines that we can easily determine will not
stop. These machines will consume the runtime without yielding an
output. As we have shown above, we can avoid generating many non-halting
machines. In other cases, we can detect them at runtime, by setting
appropriate filters. The theoretical limit of the filters is the halting problem,
which means that they cannot be exhaustive. But a practical limit is
imposed by the difficulty of checking some filters, which takes up more time
than the runtime that is saved.

We have employed some filters that have proven to be useful. Briefly, 
these are: 

• Machines without transitions to the halting state. While 
the transition table is being filled, the simulator checks to ascertain 
whether there is some transition to the halting state. If not, it avoids 
running it. 

• Escapees. These are machines that at some stage begin running 
forever in the same direction. As they are always reading new blank 
symbols, as soon as the number of non-previously visited positions is 
greater than the number of states, we know that they will not stop. 

• Cycles of period two. These cycles are easy to detect. They are 
produced when in steps s and s+2 the tape is identical and the machine 
is in the same state and the same position. When this is the case, the 
cycle will be repeated infinitely. 
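
The three filters can be sketched as follows. This is an independent Python illustration with invented names, not the C++ simulator described below, and it makes two simplifying assumptions: the halting-rule check is done on the complete table rather than while it is being filled, and the period-two check compares whole configurations at every step, which is simple but not efficient.

```python
def has_halting_rule(table):
    """Filter 1: a table with no transition to state 0 can never halt."""
    return any(nxt == 0 for (_, _, nxt) in table.values())

def classify(table, n_states, max_steps):
    """Return (verdict, output); output is a string only when the verdict is 'halt'."""
    if not has_halting_rule(table):
        return ("no_halt_rule", None)
    tape, pos, state, lo, hi = {}, 0, 1, 0, 0
    visited = set()
    fresh_run = 0      # consecutive steps on which the head read a never-visited cell
    history = []       # configurations at the two previous steps
    for _ in range(max_steps):
        ones = tuple(sorted(p for p, v in tape.items() if v == 1))
        config = (state, pos, ones)
        # Filter 3: same state, position and tape as two steps earlier.
        if len(history) == 2 and history[0] == config:
            return ("cycle2", None)
        history = (history + [config])[-2:]
        # Filter 2: more fresh blank cells in a row than there are states.
        fresh_run = fresh_run + 1 if pos not in visited else 0
        if fresh_run > n_states:
            return ("escapee", None)
        visited.add(pos)
        write, move, nxt = table[(state, tape.get(pos, 0))]
        tape[pos] = write
        if nxt == 0:
            return ("halt", "".join(str(tape[i]) for i in range(lo, hi + 1)))
        pos, state = pos + move, nxt
        lo, hi = min(lo, pos), max(hi, pos)
    return ("unknown", None)

# Hypothetical 2-state machine that bounces between two cells forever.
bouncer = {(1, 0): (0, 1, 2), (2, 0): (0, -1, 1), (1, 1): (1, 0, 0), (2, 1): (1, 0, 0)}
print(classify(bouncer, 2, 500))
```

The escapee test is sound because a run of n + 1 fresh blank cells forces some state to repeat while reading a blank, after which the same moves recur indefinitely.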

These filters were implemented in our C++ simulator, which also uses
the reduced enumeration of Section 6.1. To test them we calculated D(4)
with the simulator and compared the output to the list that was computed
in [13], arriving at exactly the same results, and thereby validating our
reduction techniques.

Running D(4) without reducing the enumeration or detecting non-halting
machines took 952 minutes. Running the reduced enumeration with
non-halting detectors took 226 minutes.

6.3 Setting the runtime 

The Busy Beaver for Turing machines with 4 states is known to be 107 steps
[3], that is, any Turing machine with 2 symbols and 4 states running longer
than 107 steps will never halt. However, the exact number is not known for
Turing machines with 2 symbols and 5 states, although it is believed to be
47 176 870, as there is a candidate machine that runs for this length of time
and halts and no machine with a greater runtime has yet been found.

So we decided to let the machines with 5 states run for 4.6 times the
Busy Beaver value for 4-state Turing machines (107 steps), knowing that
this would constitute a sample significant enough to capture the behavior
of Turing machines with 5 states. The chosen runtime ($4.6 \times 107 \approx 492$) was rounded to 500
steps, which was used to construct the output frequency distribution for
D(5).

Not all 5-state Turing machines have been used to build D(5), since only
the output of machines that halted at or before 500 steps was taken into
consideration. As an experiment to ascertain how many machines we were
leaving out, we ran $1.23 \times 10^{10}$ random Turing machines for up to 5000
steps. Among these, only 50 machines halted after 500 steps and before
5000 (that is, less than $1.75164 \times 10^{-8}$, because in the reduced enumeration
we don't include those machines that halt in one step or that we know won't
halt before we generate them, so it's a smaller fraction), with the
remaining 1 496 491 379 machines not halting at 5000 steps. As far as these are
concerned (and given that the Busy Beaver values for 5 states are unknown),
we do not know after how many steps they would eventually halt, if they
ever do. According to the following analysis, our choice of a runtime of
500 steps therefore provides a good estimation of D(5).

The frequency of runtimes of (halting) Turing machines has theoretically
been proven to drop exponentially [5], and our experiments are close to
the theoretically predicted behavior. To estimate the fraction of halting
machines that were missed because Turing machines with 5 states were
stopped after 500 steps, we hypothesize that the number of steps S a random
halting machine needs before halting is an exponential random variable,
defined by $\forall k \geq 1, P(S = k) \propto e^{-\lambda k}$. We do not have direct access to an
evaluation of P(S = k), since we only have data for those machines for which
$S \leq 5000$. But we may compute an approximation of $P(S = k \mid S \leq 5000)$,
$1 \leq k \leq 5000$, which is proportional to the desired distribution.

A non-linear regression using ordinary least-squares gives the
approximation $P(S = k \mid S \leq 5000) = \alpha e^{-\lambda k}$ with $\alpha = 1.12$ and $\lambda = 0.793$. The
residual sum-of-squares is $3.392 \times 10^{-3}$; the number of iterations with
starting values $\alpha = 0.4$ and $\lambda = 0.25$ is nine. The model's $\lambda$ is the same $\lambda$
appearing in the general law P(S = k), and may be used to estimate the
number of machines we lose by using a 500 step cut-off point for running
time: $P(k > 500) \approx e^{-500\lambda} \approx 6 \times 10^{-173}$. This estimate is far below
the point where it could seriously impair our results: the least probable
(non-impossible) string according to D(5) has an observed probability of
$1.13 \times 10^{-9}$.
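
The order of magnitude of the tail estimate can be reproduced from the fitted rate alone. The snippet below only redoes that arithmetic with the values quoted above; the variable names are illustrative.

```python
import math

lam = 0.793                  # fitted rate of the exponential model
cutoff = 500                 # runtime bound used for D(5)
least_probable = 1.13e-9     # smallest observed probability in D(5)

tail = math.exp(-lam * cutoff)
print(f"{tail:.0e}")
print(tail < least_probable)
```

The result is of the order of $10^{-173}$, many orders of magnitude below the smallest probability observed in D(5).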

Although this is only an estimate, it suggests that missed machines are 
few enough to be considered negligible. 

7 Comparison of $K_m$ with the number of instructions used and Logical Depth

We now study the relation of $K_m$ with the minimal number of instructions
used by a Turing machine producing a given string, and with Bennett's
concept of logical depth. As expected, $K_m$ shows a correlation with the
number of instructions used but not with logical depth.

7.1 Relating $K_m$ to the number of instructions used

First, we are interested in the relation of $K_m(s)$ to the minimal number of
instructions that a Turing machine producing a string s uses. Machines in
D(5, 2) have a transition table with 10 entries, corresponding to the different
pairs [n, m], with n one of the five states and m either "0" or "1". These
are the 10 instructions that the machine can use. But for a fixed input
not all instructions are necessarily used. Thus, for a blank tape, not all
machines that halt use the same number of instructions. The simplest cases
are machines halting in just one step, that is, machines whose transition
for (initial state, blank symbol) goes to the halting state, producing a string
"0" or "1". So the simplest strings produced in D(5, 2) are computed by
machines using just one instruction. We expected a correlation between the
$K_m$ complexity of the strings and the number of instructions used. As we
show, the following experiment confirmed this.

We used a sample of $2836 \times 10^9$ random machines in the reduced
enumeration for D(5, 2), that is, 29% of the total number of machines. The output
of the sample returns the strings produced by halting machines together
with the number of instructions used, the runtime and the instructions for
the Turing machine. To save space, we only saved the smallest number
of instructions found for each string produced, and the smallest runtime
corresponding to that particular number of instructions.
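
The space-saving rule amounts to keeping, for each string, the lexicographically smallest pair (instructions used, runtime). A minimal sketch with an illustrative function name and hypothetical sample values:

```python
def record(best, string, instructions, runtime):
    """Keep the fewest instructions seen for a string, and the shortest runtime
    among the machines that use that many instructions."""
    candidate = (instructions, runtime)
    # Tuples compare by instructions first and by runtime only on ties.
    if string not in best or candidate < best[string]:
        best[string] = candidate

best = {}
for instructions, runtime in [(5, 40), (4, 90), (4, 60), (6, 3)]:
    record(best, "0110", instructions, runtime)
print(best)
```

Note that a faster machine using more instructions, such as the last one in the example, does not replace the stored entry.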

After doing the appropriate symmetry completions we have 99 584
different strings, which is to say almost all the 99 608 strings found in D(5, 2).
The number of instructions used goes from 1 to 10. When 1 instruction is
used only "0" and "1" are generated, with a $K_m$ value of 2.51428. With
2 instructions, all 2-bit strings are generated, with a $K_m$ value of 3.32744.
For 3 or more instructions, Fig. 1 shows the distribution of values of $K_m$.
Table 1 shows the mean $K_m$ values for the different numbers of instructions
used.

Figure 1: Distribution of $K_m$ values according to the number of instructions
used.



Used inst.   Mean $K_m$   Mean Length
 1           2.51428       1
 2           3.32744       2
 3           5.44828       3
 4           8.22809       4
 5           11.4584       5
 6           15.3018       6.17949
 7           20.1167       7.76515
 8           26.0095       9.99738
 9           31.4463       12.6341
10           37.5827       17.3038

Table 1: Mean $K_m$ and string length for different numbers of instructions
used.

This accords with our expectations. Machines using a low number of
instructions can be repeated many times by permuting the order of states.
So the probability of producing their strings is greater, which means low
$K_m$ values.



Figure 2: Instructions used and string lengths.

We can also look at the relation between the number of instructions used
and the length of the strings produced. For $1 \leq i \leq 5$, all strings of length
i are produced by machines using i instructions. For a greater number of
instructions used, Fig. 2 shows the distribution of string lengths. Table 1
shows the mean length for each number of instructions used.

The correlation $r_{K_m,N} = 0.83$ is a good indicator for quantifying the
apparent relation between $K_m$ and the number N of instructions used, indicating
a strong positive link. However, since the length L of outputs is linked with
both variables, the partial correlation $r_{K_m,N.L} = 0.81$ is a better index. This
value indicates a strong relation between $K_m$ and N, even while controlling
for L.
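
For reference, a first-order partial correlation of this kind can be obtained from the three pairwise Pearson coefficients. The sketch below uses the standard textbook formula; it is not necessarily the procedure or software used for the figures reported here, and the sample data are invented for illustration.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def partial_corr(x, y, z):
    """Correlation of x and y after controlling for z."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))

# Invented values standing in for K_m, N and L of a few strings.
km = [2.5, 3.3, 5.4, 8.2, 11.5, 15.3]
n_instr = [1, 2, 3, 4, 5, 6]
length = [1, 2, 3, 5, 4, 7]
print(pearson(km, n_instr), partial_corr(km, n_instr, length))
```

When the controlling variable is uncorrelated with both of the others, the partial correlation coincides with the plain one.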

7.2 Logical Depth and $K_m$

As explained above, for the machines which generate each string using the
minimum number of instructions we have also recorded the minimum
runtime. These runtimes are related to Bennett's logical depth (LD),
as they are the shortest runtimes of the smallest Turing machines producing
each string in D(5, 2).

We have partitioned the runtime space from 1 to 500 (our runtime
bound) into 20 groups of equal length (25 steps). In order to explore the
relation of $K_m$ to Bennett's LD we are interested in the values of $K_m$ for the
strings in each group. Fig. 3 shows the minimum, mean and maximum $K_m$
values for each of the runtime groups. The same information is in Table 2.
The distribution of $K_m$ values for the different groups is shown in Fig. 4.
For each interval, the maximum runtime is shown on the horizontal axis.

Figure 3: LD and $K_m$ (min, mean and max values).

Figure 4: LD and $K_m$ (distribution).

Runtime    Min $K_m$   Mean $K_m$   Max $K_m$
...
201-225                38.4213
226-250    33.9767     38.3846
251-275    33.3093     38.4249      39.0642
276-300    33.3363     38.2785      39.0642
301-325    36.7423     38.5963      39.0642
326-350    32.8943     38.2962      39.0642
351-375    32.8163     38.3742      39.0642
376-400    36.0642     38.6081      39.0642
401-425    33.2062     38.4035      39.0642
426-450    33.1100     38.5543      39.0642
451-475    37.0642     38.7741      39.0642
476-500    36.0642     38.6147      39.0642

Table 2: Extreme and mean $K_m$ values for different runtime intervals.

We now provide some examples of the discordance between K m and LD. 
"0011110001011" is a string with high K m and low LD. Fig. 5 shows the 
transition table of the smallest machine found producing this string. The 
runtime is low, just 29 steps (of the 99 584 different strings found in our 
sample, only 3 360 are produced in fewer steps), but the machine uses 10 
instructions and produces a string with complexity 39.0642, the greatest 
K m value we have calculated. Fig. 6 shows the execution of the machine. 

Figure 5: Transition table of a machine producing "0011110001011". 

Figure 6: Execution of the machine producing "0011110001011". 

On the other hand, "(10)^20 1" (the pair 10 repeated 20 times, followed 
by 1) is a string with high LD but low K m value. Fig. 7 shows the 
transition table of the machine found producing this string, and Fig. 8 
depicts the execution. The machine uses 9 instructions and runs for 441 
steps (only 710 strings out of the 99 584 strings in our sample require 
more time) but its K m value is 33.11. This is a low complexity if we 
consider that in K m there are 99 608 strings and that 90 842 are more 
complex than this one. 



Figure 7: Transition table of a machine producing "(10)^20 1". 

Figure 8: Execution of the machine producing "(10)^20 1". 

We may rate the overall strength of the relation between K m and LD by 
the correlation r_{K_m,LD} = 0.41, corresponding to a medium positive link. 
As we previously mentioned, however, the fact that the length L of the 
strings is linked with both variables may bias our interpretation. A more 
relevant measure is thus the partial correlation r_{K_m,LD.L} = -0.06, a 
value close to zero indicating no significant relation between K m and LD 
once L is controlled. 
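
To illustrate how a shared dependence on L can produce a sizeable raw 
correlation that vanishes once L is controlled, here is a minimal Python 
sketch. It assumes the standard first-order partial correlation of Pearson 
coefficients (the text does not name the estimator), and the three short 
lists are toy values built for the illustration, not data from this study. 

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def partial_correlation(xs, ys, zs):
    """Correlation of xs and ys after controlling for zs (first order)."""
    r_xy = pearson(xs, ys)
    r_xz = pearson(xs, zs)
    r_yz = pearson(ys, zs)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Toy data (hypothetical): both variables equal the common factor plus an
# independent component, so their only link is the common factor.
toy_length = [1, 1, -1, -1]   # stands in for L
toy_km = [2, 0, 0, -2]        # stands in for K m
toy_depth = [2, 0, -2, 0]     # stands in for LD

raw = pearson(toy_km, toy_depth)                                # 0.5
controlled = partial_correlation(toy_km, toy_depth, toy_length)  # 0 up to rounding
```

In this toy case the raw correlation of 0.5 is entirely due to the common 
factor, which is the kind of bias the partial correlation is meant to 
remove. 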
8 Concluding remarks 



As we expected, the Kolmogorov-Chaitin complexity evaluated by means 
of Levin's Coding Theorem from the output distribution of small Turing 
machines correlates with the number of instructions used but not with 
logical depth. Logical depth yields a reasonable measure of complexity that 
is different from the measure obtained by considering algorithmic 
complexity (K) alone, and this investigation indicates that all three 
measures (Kolmogorov-Chaitin complexity, Solomonoff-Levin algorithmic 
probability and Bennett's logical depth) are numerically approachable, 
sound and consistent with theoretical expectations, and may be used in 
real-world applications. K as a measure of program size is supposed to be 
an integer (the length of a program in bits). K m, however, yields 
non-integer values. Because K m is shown to be a finer measure than the 
length of Turing machines, these results also justify the utility of 
non-integer values in the approximation of the algorithmic complexity of 
short strings, which also means being able to avoid the longer 
calculations that would have to be undertaken if only integer values were 
allowed. 
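
The non-integer character of K m follows directly from its definition 
through the coding theorem, K m (s) = -log2 of the frequency with which 
s is produced. The sketch below assumes that definition as given earlier 
in the paper; the function names and the sample frequency 3/16 are 
illustrative only. 

```python
import math


def km_from_frequency(frequency):
    """Coding-theorem estimate: K m = -log2(output frequency), in bits."""
    if not 0 < frequency <= 1:
        raise ValueError("frequency must lie in (0, 1]")
    return -math.log2(frequency)


def frequency_from_km(km):
    """Inverse mapping: the output frequency implied by a K m value."""
    return 2.0 ** (-km)


# A frequency that is an exact power of two gives an integer value...
exact_bits = km_from_frequency(1 / 4)

# ...but a generic frequency (hypothetical value) does not.
fractional_bits = km_from_frequency(3 / 16)

# The largest K m value reported in this section implies this frequency.
rarest_frequency = frequency_from_km(39.0642)
```

Two strings whose frequencies differ by less than a factor of two would 
receive the same integer program length but distinct K m values, which is 
the sense in which K m is the finer measure. 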

The Online Algorithmic Complexity Calculator (OACC), available at 
http://www.complexitycalculator.com, is a long-term project to develop an 
online tool implementing the semi-computable measures of complexity 
described in this paper. It is expected to be expanded in the future. 

It currently implements numerical approximations of Kolmogorov com- 
plexity and Levin's distribution for short binary strings following the numer- 
ical methods described herein, strings for which lossless compression algo- 
rithms fail to approximate their Kolmogorov complexity. Hence it provides 
a complementary and alternative method to compression algorithms. 

The OACC is intended to provide a comprehensive framework of uni- 
versal mathematical measures of randomness, structure and simplicity for 
researchers and professionals. It can be used to provide objective measures 
of complexity in a very wide range of disciplines, from bioinformatics to 
psychometrics, from linguistics to finance. More measures, more data and 
better approximations will be gradually incorporated in the future, covering 
a wider range of objects, such as longer binary strings, non-binary strings 
and n-dimensional arrays (such as images). 